153-2008: SAS/OR®: Rigorous Constrained Optimized Binning for Credit Scoring

Authors

  • Ivan Oliveira
  • Manoj Chari
  • Susan Haller
Abstract

Credit scoring is a statistical modeling technique used to assign risk to credit applicants or to existing credit accounts. We present a new process that enhances the formulation and solution approach in the SAS® system during the so-called “binning” phase by exploiting SAS/OR optimization capabilities to treat the problem in a mathematically rigorous way. Typically, attributes such as age and income are segmented into grouping intervals, with the aim of creating bins for scorecards that maximize correlation with these attributes. A critical aspect of the binning process is the enforcement of various constraints, such as the minimum and maximum number of bins, minimum and maximum bin widths, and the maximum number of observations per bin. These requirements significantly complicate the binning process. Our approach is rigorous in the sense that global linear constraints are enforced exactly through mixed-integer linear programming. Furthermore, the methodology can be extended to a fully rigorous approach, which incorporates an additional nonlinear constraint of imposed monotonicity at the cost of computational time, within the same mathematical programming context by adding variables and constraints related to “weight of evidence” (WOE). The key enabling factor is the structure of the mathematical programming model and the careful implementation of its constraints. We present some results of the proposed binning process and the related scorecards, and we compare the solutions against the current state-of-the-art approach. The methodology presented here will be available through SAS® Enterprise Miner™ for Credit Scoring in a near-future release.
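To make the idea of constrained binning concrete, here is a minimal Python sketch. It is not the mixed-integer linear programming formulation described in the abstract, and it does not use SAS/OR: it swaps in a simple dynamic program over pre-sorted fine bins, which is exact only for this small variant of the problem. The function names (`woe_iv`, `best_binning`), the choice of information value (IV) as the objective, the WOE sign convention (log of good share over bad share), and the toy counts are all assumptions made for illustration. Bin-width and monotonicity constraints are left out.

```python
import math

def woe_iv(goods, bads, total_goods, total_bads):
    """Return (WOE, IV contribution) for one bin; both counts must be positive."""
    share_good = goods / total_goods
    share_bad = bads / total_bads
    woe = math.log(share_good / share_bad)
    return woe, (share_good - share_bad) * woe

def best_binning(goods, bads, min_bins, max_bins, max_obs):
    """Merge adjacent fine bins into min_bins..max_bins coarse bins,
    each holding at most max_obs observations, maximizing total IV.
    Returns (total_iv, [(start, end), ...]) with end exclusive, or None if infeasible."""
    n = len(goods)
    total_goods, total_bads = sum(goods), sum(bads)
    # Prefix sums give the counts of any contiguous range in O(1).
    cum_g, cum_b = [0], [0]
    for g, b in zip(goods, bads):
        cum_g.append(cum_g[-1] + g)
        cum_b.append(cum_b[-1] + b)
    neg_inf = float("-inf")
    # best[k][j]: highest IV when the first j fine bins form exactly k coarse bins.
    best = [[neg_inf] * (n + 1) for _ in range(max_bins + 1)]
    back = [[None] * (n + 1) for _ in range(max_bins + 1)]
    best[0][0] = 0.0
    for k in range(1, max_bins + 1):
        for j in range(1, n + 1):
            for i in range(k - 1, j):
                if best[k - 1][i] == neg_inf:
                    continue
                g = cum_g[j] - cum_g[i]
                b = cum_b[j] - cum_b[i]
                # Skip bins with undefined WOE or too many observations.
                if g == 0 or b == 0 or g + b > max_obs:
                    continue
                cand = best[k - 1][i] + woe_iv(g, b, total_goods, total_bads)[1]
                if cand > best[k][j]:
                    best[k][j] = cand
                    back[k][j] = i
    feasible = [k for k in range(min_bins, max_bins + 1) if best[k][n] > neg_inf]
    if not feasible:
        return None
    k = max(feasible, key=lambda kk: best[kk][n])
    total_iv = best[k][n]
    # Walk the back-pointers to recover the bin boundaries.
    bins, j = [], n
    while k > 0:
        i = back[k][j]
        bins.append((i, j))
        j, k = i, k - 1
    bins.reverse()
    return total_iv, bins

# Toy data (hypothetical): good/bad counts for four fine bins of one attribute.
goods = [10, 20, 30, 40]
bads = [40, 30, 20, 10]
total_iv, bins = best_binning(goods, bads, min_bins=2, max_bins=3, max_obs=100)
```

On this toy data the search keeps the two outer fine bins separate and merges the two middle ones. A mathematical programming formulation, as the abstract proposes, becomes attractive when constraints couple bins in ways a simple recursion like this cannot express.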

Similar resources

A Necessary Condition for a Good Binning Algorithm in Credit Scoring

Binning is a categorization process to transform a continuous variable into a small set of groups or bins. Binning is widely used in credit scoring. In particular, it can be used to define the Weight of Evidence (WOE) transformation. In this paper, we first derive an explicit solution to a logistic regression model with one independent variable that has undergone a WOE transformation. We then u...

Paper 1323-2017: Real AdaBoost: Boosting for Credit Scorecards and Similarity to WOE Logistic Regression

AdaBoost is a machine learning algorithm that builds a series of small decision trees, adapting each tree to predict difficult cases missed by the previous trees and combining all trees into a single model. We will discuss the AdaBoost methodology and introduce the extension called Real AdaBoost. Real AdaBoost comes from a strong academic pedigree: its authors are pioneers of machine learning a...

Publication date: 2008